Genome-scale model of Pseudomonas aeruginosa metabolism unveils virulence and drug potentiation

Pseudomonas aeruginosa is one of the leading causes of hospital-acquired infections. To decipher the metabolic mechanisms associated with virulence and antibiotic resistance, we have developed an updated genome-scale model (GEM) of P. aeruginosa. The model (iSD1509) is an extensively curated, three-compartment, and mass-and-charge balanced BiGG model containing 1509 genes, the largest gene content for any P. aeruginosa GEM to date. It is the most accurate with prediction accuracies as high as 92.4% (gene essentiality) and 93.5% (substrate utilization). In iSD1509, we newly added a recently discovered pathway for ubiquinone-9 biosynthesis which is required for anaerobic growth. We used a modified iSD1509 to demonstrate the role of virulence factor (phenazines) in the pathogen survival within biofilm/oxygen-limited condition. Further, the model can mechanistically explain the overproduction of a drug susceptibility biomarker in the P. aeruginosa mutants. Finally, we use iSD1509 to demonstrate the drug potentiation by metabolite supplementation, and elucidate the mechanisms behind the phenotype, which agree with experimental results.


Initial reconstruction and manual refinement 25
We used CarveMe 1 to create a draft reconstruction that contained 2196 reactions and 1482 26 genes. Following this, we added the biomass reaction from the previous SEED model 27 (iPau1129 2 ). We gap-filled our model in LB rich medium to produce all the biomass constituents 28 by adding the reactions from iPau1129 or from other BiGG models. 29 Next, we curated our model through a semi-automated approach. We chose seventeen 30 different BiGG models to perform a bidirectional blast hit for each of the reactions present in 31 the draft model (Supplementary Table 2). We then manually checked the reactions for any 32 discrepancy in the gene-protein-reaction (GPR) associations made by CarveMe and by our 33 approach. Any discrepancy was resolved using the knowledge from other databases (e.g., IMG 3 , 34 KEGG 4 ). Reactions removed from the model are provided (Supplementary Data 1). 35 Using carbon substrate essentiality 2 and gene essentiality data 5 , we iteratively filled 36 more knowledge gaps either through annotation-or literature-derived information. A list of 37 reactions that were added to the model is provided (Supplementary Data 2). We also utilized 38 gene essentiality data to fix GPR associations. Likewise, biomass reaction was also modified 39 during this process to better reflect the gene essentiality data. We next added any extra genes in iPau1129 not present in our model. This process led 41 to the addition of reactions related to alginate production, rhamnolipid production, pyochelin 42 production, etc. Only 44 genes from iPau1129 are missing in the current model (Supplementary 43 Data 3). Finally, we checked and corrected any mass-and-charge imbalance, first by applying 44 the knowledge from well-curated BiGG 6 models (iJN1462 7 and iML1515 8 ), which led to only 129 45 antibiotics, indicating a reconstruction of a metabolically versatile organism (Supplementary 68

Initial Reconstruction 72
For the initial reconstruction, CarveMe 1 with gap-filling function in LB medium was utilized. 73 From the previous P. aeruginosa model (iPau1129) 2 , we added the biomass reaction along with 74 other required reactions to simulate the growth of the model on LB medium. For this process, 75 we first converted the ModelSEED 11 metabolite identifiers of iPau1129 to BiGG identifiers using 76 custom mapping database. Furthermore, the bounds of non-growth associated reaction 77 (NGAM) were also added from iPau1129. Then, we removed all the artificial sink and demand 78 reactions added by CarveMe by making sure that the respective metabolites can be produced 79 or consumed by added reactions which are supported by gene evidence. 80 Then, the Pseudomonas aeruginosa PA14 reactome was inspected for the gene-protein-81 reaction (GPR) associations using a custom pipeline ( Supplementary Fig. 1). First, seventeen 82 models from BiGG database and their respective protein FASTA files were downloaded 83 (Supplementary Table 2). Then, we performed a bidirectional blast hit (BBH) for all the 84 reactions present in the model by prioritizing the strains that are taxonomically closer to PA14 85 strain. We applied a stringent method such that only the top hits in both directions were 86 considered as the correct genes. If not, we manually inspected the top blast hits in different 87 databases including IMG 3 and KEGG 4 . These GPR associations were then used as alternatives to 88 CarveMe predictions. We performed a rigorous manual check for the reactions whose GPR 89 associations were derived from the BBH approach from models other than those from 90 Pseudomonas putida. For any discrepancy between GPR associations from CarveMe and those 91 from BBH approach, we checked IMG and/or KEGG and/or iPau1129 model to assign correct 92 gene associations. Any reactions that did not have associated GPRs or with no evidence to be 93 present in PA14 were discarded in this process. We also removed unnecessary reactions that 94 lead to cyclic production and consumption of metabolites during this process. The list of 95 deleted reactions is provided (Supplementary Data 1). We iteratively improved and validated 96 the model using substrate utilization 2 and gene essentiality 5 data which led to the modification 97 of the biomass reaction and more changes to GPR associations. We also added any genes from 98 iPau1129 that were not associated with the model at that time. 99 100

Reaction Mass and Charge Balance 101
For mass-and-charge balance, metabolite formula and charge were assigned using iJN1462 first, 102 and then iML1515. Next, we manually checked the remaining metabolites on multiple 103 databases including BiGG 6 , MetaNetX 9 , PubChem 10 or ModelSEED 11 to identify the correct 104 formula and charge. Finally, if the information for metabolites could not be found in the 105 aforementioned databases, we assigned the formula and charge by balancing the reactions that 106 contain only those metabolites as the sole undetermined ones. 107

Medium formulation 117
Six different media were used in the study. LB and SCFM composition and uptake bounds were 118 provided by Papin Lab. Likewise, a minimal medium composition was also generously provided 119 by Papin Lab. By default, these media and their respective flux constraints were used unless 120 stated otherwise. For case studies, the minimal medium composition derived from the 121 respective studies was used 13, 14, 15 . The fluxes of minimal medium provided by the Papin lab 122 were used as constraints on these three minimal media. 123

M9 (Current Protocol): The medium was derived from Current Protocols in Molecular 124
Biology 13 which was provided by Brad Poulsen from Eric Lander's lab. This medium was 125 used for core essential genes assessment study and gluconate production study.  3. M9 (Meylan): The medium was derived from Meylan et al. 15 . 129 In the minimal medium, ferrous ion (Fe 2+ ) was added because the model cannot simulate 130 growth without the ion. For further details on the medium and the uptake rates used, please 131 see Supplementary Data 5. 132

Substrate Utilization Screening and Gene Knockout Study 133
For the initial carbon source catabolic activity and gene essentiality validation, the same 134 datasets used in Bartell  assessing the gene essentiality, we used the dataset from a recent study by Poulsen et al. 17 . In 150 this dataset, core essential genes were defined as essential genes in five media conditions and 151 nine different strains. We analyzed these core essential genes in LB medium, SCFM (Synthetic 152 Cystic Fibrosis Medium) and glucose minimal medium. For gene essentiality comparison in 153 glucose minimal medium, iron had to be added even though the media used in the study 17 154 presumably did not contain iron.

Sensitivity analysis for pyocyanin production rate computation 167
Oxygen uptake rate was gradually increased from 0 mmol.gDW -1 hr -1 to 20 mmol.gDW -1 168 hr -1 in step 0.01 mmol.gDW -1 hr -1 . The pyocyanin synthesis flux was computed for this range to 169 determine the range of oxygen uptake rates to use for future simulations. 170 171